The Hazards in Segmentation of Handwritten Hindi Text

Authors

  • Naresh Kumar Garg
  • Lakhwinder Kaur
  • M. K. Jindal
Abstract

Optical Character Recognition (OCR) is the process of recognizing handwritten or printed scanned text with the help of a computer. Segmentation is a very important stage of any text recognition system. Problems in segmentation can lower the segmentation rate and hence the recognition rate, whereas a good segmentation technique can improve the recognition rate. This paper deals with the hazards that occur in the segmentation of handwritten Hindi text. We also explain the main reasons for some of these problems.

Similar resources

A Structural Approach for Segmentation of Handwritten Hindi Text

This paper attempts to segment handwritten Hindi words. The problem of segmentation is compounded by the possible presence of modifiers (matras) on all sides of the basic characters and by the uncertainty that different writing styles introduce into the character shapes. We have devised a structural approach to capture the similarities and differences between structure class...

Connected Component Based Word Spotting on Persian Handwritten image documents

Word spotting makes unindexed image documents searchable by locating a word or words in a document image, given a query word. This problem is challenging, mainly due to the large number of word classes with very small inter-class and substantial intra-class distances. In this paper, a segmentation-based word spotting method is presented for multi-writer Persian handwritten doc-...

Language identification for handwritten document images using a shape codebook

Language identification for handwritten document images is an open document analysis problem. In this paper, we propose a novel approach to language identification for documents containing a mixture of handwritten and machine printed text using image descriptors constructed from a codebook of shape features. We encode local text structures using scale and rotation invariant codewords, each repres...

Distinction between Machine Printed Text and Handwritten Text in a Document

In many documents, machine printed and handwritten texts are intermixed. Optical Character Recognition (OCR) techniques are different for machine printed and handwritten text, so it is necessary to separate these texts before giving them as input to the OCR. In this paper we propose a methodology for the Hindi language. This methodology is based on structural features of text. Experimental results on a data...

Recognition of Handwritten Devanagari Words Using Neural Network

Handwritten word recognition is an important problem in pattern recognition. Online handwriting recognition for Devanagari words is still at a developing stage and remains challenging due to the complexity involved. In India, more than 300 million people use Devanagari script for documentation. There has been a significant improvement in the research related to the recognition of...

Publication date: 2011